Rough Set Analysis of Preference-Ordered Data

Authors

  • Roman Slowinski
  • Salvatore Greco
  • Benedetto Matarazzo
Abstract

The paper is devoted to knowledge discovery from data, taking into account prior knowledge about preference semantics in the patterns to be discovered. The data concern a set of situations (objects, states, examples) described by a set of attributes (properties, features, characteristics). The attributes are, in general, divided into condition and decision attributes, corresponding to the input and output of a situation. The situations are partitioned by the decision attributes into decision classes. A pattern discovered from the data has the symbolic form of a decision rule or decision tree. In many practical problems, some condition attributes are defined on preference-ordered scales and the decision classes are also preference-ordered. Unfortunately, the known methods of knowledge discovery ignore this preference information, thus taking the risk of drawing wrong patterns. To deal with preference-ordered data we propose a new approach called the Dominance-based Rough Set Approach (DRSA). Given a set of situations described by at least one condition attribute with a preference-ordered scale and partitioned into preference-ordered classes, the new rough set approach is able to approximate this partition by means of dominance relations. The rough approximation of this partition is a starting point for the induction of “if..., then...” decision rules. The syntax of these rules is adapted to represent preference orders. DRSA analyses only facts present in the data, and possible inconsistencies are identified. It preserves the concept of granular computing; however, the granules are dominance cones in the evaluation space rather than bounded sets. It is also concordant with the paradigm of computing with words, as it exploits the ordinal, and not necessarily cardinal, character of data.

1 How Prior Knowledge Influences Knowledge Discovery?

Discovering knowledge from data means being able to find concise classification patterns that agree with the situations described by the data. They are useful for explanation of the data and for prediction of future situations in such applications as technical diagnostics, performance evaluation or risk assessment. The situations are described by a set of attributes, also called properties, features, characteristics, etc. The attributes may be either on the condition or the decision side of the description, corresponding to the input or output of a situation. The situations may be objects, states, examples, etc.; it will be convenient to call them objects in this paper. The data set in which classification patterns are searched for is called the learning sample. Learning patterns from this sample assumes certain prior knowledge that may include the following items:

(i) domains of attributes, i.e. sets of values that an attribute may take while remaining meaningful for the user’s perception,
(ii) division of attributes into condition and decision attributes, restricting the range of patterns to functional relations between condition and decision attributes,
(iii) preference order in the domains of some attributes and semantic correlation between pairs of these attributes, requiring the patterns to observe the dominance principle.

In fact, item (i) is usually taken into account in knowledge discovery.
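For concreteness, the three items can be thought of as simple metadata accompanying the learning sample. The following minimal Python sketch is an illustrative assumption, not taken from the paper; the attribute names anticipate the pupils’ example discussed below:

# Illustrative sketch only: the three items of prior knowledge expressed as
# metadata accompanying a learning sample (layout and names are assumptions).

# (i) domains of attributes
domains = {
    "Math": ["bad", "medium", "good"],
    "Ph":   ["bad", "medium", "good"],
    "GA":   ["bad", "medium", "good"],
}

# (ii) division into condition and decision attributes
condition_attributes = ["Math", "Ph"]
decision_attributes = ["GA"]

# (iii) preference orders on some domains (criteria) and semantic correlations
preference_order = {"bad": 0, "medium": 1, "good": 2}    # higher rank = better
criteria = {"Math", "Ph", "GA"}                          # preference-ordered attributes
semantic_correlation = [("Math", "GA"), ("Ph", "GA")]    # condition criterion -> decision criterion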
With prior knowledge of type (i) only, one can discover patterns called association rules [1], showing strong relationships between values of some attributes, without fixing which attributes will be on the condition side and which on the decision side in the rules. If item (i) is combined with item (ii) in the prior knowledge, then one can consider a partition of the learning sample into decision classes defined by the decision attributes. The patterns to be discovered then have the form of decision trees or decision rules representing functional relations between condition and decision attributes. These patterns are typically discovered by machine learning and data mining methods [19]. As there is a direct correspondence between decision trees and rules, we will further concentrate our attention on decision rules.

As item (iii) is crucial for this paper, let us explain it in more detail. Consider an example of a data set concerning pupils’ achievements in a high school. Suppose that among the attributes describing the pupils there are results in mathematics (Math) and physics (Ph), and a general achievement (GA). The domains of these attributes are composed of three values: bad, medium and good. This information constitutes item (i) of prior knowledge. The preference order of the attribute values is obvious: good is better than medium and bad, and medium is better than bad. It is known, moreover, that Math is semantically correlated with GA, as well as Ph with GA. This is, precisely, item (iii) of prior knowledge. Attributes with preference-ordered domains are called criteria in decision theory. We will use the name regular attributes for those attributes whose domains are not preference-ordered. Semantic correlation between two criteria means that an improvement on one criterion should not worsen the evaluation on the second criterion. In our example, an improvement of a pupil’s score in Math or Ph, with the other attribute values unchanged, should not worsen the pupil’s general achievement GA, but rather improve it.

What classification patterns can be drawn from the pupils’ data set? If prior knowledge includes items (i) and (iii) only, then association rules can be induced; if item (ii) is known in addition to (i) and (iii), then decision rules can be induced. The next question is: how does item (iii) influence association rules and decision rules? It has been specified above that item (iii) requires the patterns to observe the dominance principle. The dominance principle (also called the Pareto principle) should be observed by (association and decision) rules having at least one pair of semantically correlated criteria spanned over the condition and decision parts. Each rule is characterized by a condition profile and a decision profile, corresponding to the vectors of threshold values of attributes in the condition and decision parts of the rule, respectively. We say that one profile dominates another if they both involve the same attributes, the criteria values of the first profile are not worse than the criteria values of the second profile, and the values of the regular attributes in both profiles are indiscernible.
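For illustration, a minimal Python sketch of this profile dominance test could look as follows (the function name and data layout are assumptions; a profile maps attributes to values, `criteria` maps each criterion to its value ranking, and regular attributes are checked for indiscernibility only):

def dominates(profile_a, profile_b, criteria):
    """True if profile_a dominates profile_b over the same set of attributes."""
    if profile_a.keys() != profile_b.keys():
        return False                        # both profiles must involve the same attributes
    for attr, value_a in profile_a.items():
        value_b = profile_b[attr]
        if attr in criteria:                # criterion: value must be not worse
            rank = criteria[attr]
            if rank[value_a] < rank[value_b]:
                return False
        elif value_a != value_b:            # regular attribute: values must be indiscernible
            return False
    return True

order = {"bad": 0, "medium": 1, "good": 2}
print(dominates({"Math": "good", "Ph": "medium"},
                {"Math": "medium", "Ph": "medium"},
                criteria={"Math": order, "Ph": order}))     # True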
The dominance principle requires the following: consider two rules, r and s, involving the same regular attributes and criteria, such that each criterion used in the condition part is semantically correlated with at least one criterion present in the decision part of these rules; if the condition profile of rule r dominates the condition profile of rule s, then the decision profile of rule r should also dominate the decision profile of rule s. Suppose that two rules induced from the pupils’ data set relate Math and Ph on the condition side with GA on the decision side:

rule #1: if Math=medium and Ph=medium, then GA=good,
rule #2: if Math=good and Ph=medium, then GA=medium.

The two rules do not observe the dominance principle because the condition profile of rule #2 dominates the condition profile of rule #1, while the decision profile of rule #2 is dominated by the decision profile of rule #1. Thus, in the sense of the dominance principle the two rules are inconsistent, that is, they are wrong. One could say that the above rules are true because they are supported by examples of pupils from the learning sample, but this would mean that the examples are also inconsistent. The inconsistency may come from many sources, e.g.:

– missing attributes (regular ones or criteria) in the description of objects; maybe the data set does not include such an attribute as the opinion of the pupil’s tutor (OT), expressed only verbally during assessment of the pupil’s GA by the school teachers’ council,
– unstable preferences of decision makers; maybe the members of the school teachers’ council changed their view on the influence of Math on GA during the assessment.

Handling these inconsistencies is of crucial importance for knowledge discovery. They cannot simply be considered as noise or error to be eliminated from the data, or amalgamated with consistent data by some averaging operators; they should be identified and presented as uncertain patterns. If item (iii) were ignored in the prior knowledge, then handling the above-mentioned inconsistencies would be impossible. Indeed, there would be nothing wrong with rules #1 and #2: they are supported by different examples discerned by the considered attributes.

It has been acknowledged by many authors that rough set theory provides an excellent framework for dealing with inconsistency in knowledge discovery [18, 20, 21, 22, 24, 27, 29, 30]. The paradigm of rough set theory is that of granular computing, because the main concept of the theory – rough approximation of a set – is built up of blocks of objects indiscernible by a given set of attributes, called granules of knowledge. In the space of regular attributes, the granules are bounded sets. Decision rules induced from a rough approximation of a classification are also built up of such granules. While taking into account prior knowledge of types (i) and (ii), the rough approximation and the inherent rule induction ignore, however, prior knowledge of type (iii). In consequence, the resulting decision rules may be inconsistent with the dominance principle. The authors have proposed an extension of the granular computing paradigm that permits taking into account prior knowledge of type (iii), in addition to either (i) only [17], or (i) and (ii) together [5, 6, 10, 13, 15, 26]. The combination of the new granules with the idea of rough approximation constitutes the so-called Dominance-based Rough Set Approach (DRSA).
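Returning to rules #1 and #2 above, a small self-contained Python check (purely illustrative; the numeric encoding bad < medium < good and the names are assumptions) shows how a violation of the dominance principle can be detected mechanically:

ORDER = {"bad": 0, "medium": 1, "good": 2}   # illustrative encoding of the preference order

def dominates(p, q):
    """Profile p dominates profile q: not worse on every (criterion) attribute."""
    return all(ORDER[p[a]] >= ORDER[q[a]] for a in p)

rules = [
    ({"Math": "medium", "Ph": "medium"}, {"GA": "good"}),    # rule #1
    ({"Math": "good",   "Ph": "medium"}, {"GA": "medium"}),  # rule #2
]

for i, (cond_r, dec_r) in enumerate(rules, start=1):
    for j, (cond_s, dec_s) in enumerate(rules, start=1):
        if i != j and dominates(cond_r, cond_s) and not dominates(dec_r, dec_s):
            print(f"rules #{i} and #{j} are inconsistent with the dominance principle")
# prints: rules #2 and #1 are inconsistent with the dominance principle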
In the following sections we present the concept of granules permitting to handle prior knowledge of type (iii); then we briefly sketch DRSA and its main extensions. As the sets of decision rules resulting from DRSA can be seen as preference models in multicriteria decision problems, we briefly comment on this issue; application of the new paradigm of granular computing to the induction of association rules is also mentioned before the conclusions.

2 How Prior Knowledge about Preference Order in Data Influences Granular Computing?

In other words, how should the granule of knowledge be defined in the attribute space in order to take into account prior knowledge about preference order in the data when searching for rules? As is usual in knowledge discovery methods, information about objects is represented in a data table, in which rows are labelled by objects and contain the values of attributes for each corresponding object, whereas columns are labelled by attributes and contain the values of each corresponding attribute for the objects.

Let U denote a finite set of objects (universe) and Q a finite set of attributes divided into a set C of condition attributes and a set D of decision attributes; C ∩ D = ∅. Let also X_C = ∏_{q=1}^{|C|} X_q and X_D = ∏_{q=1}^{|D|} X_q be the attribute spaces corresponding to the sets of condition and decision attributes, respectively. Elements of X_C and X_D can be interpreted as possible evaluations of objects on the attributes from set C = {1, ..., |C|} and from set D = {1, ..., |D|}, respectively. Therefore, X_q is the set of possible evaluations of the considered objects with respect to attribute q. The value of object x on attribute q ∈ Q is denoted by x_q. Objects x and y are indiscernible by P ⊆ C if x_q = y_q for all q ∈ P and, analogously, objects x and y are indiscernible by R ⊆ D if x_q = y_q for all q ∈ R. The sets of indiscernible objects are equivalence classes of the corresponding indiscernibility relation I_P or I_R. Moreover, I_P(x) and I_R(x) denote the equivalence classes including object x. I_D makes a partition of U into a finite number of decision classes Cl = {Cl_t, t ∈ T}, T = {1, ..., n}. Each x ∈ U belongs to one and only one class Cl_t ∈ Cl.

The above definitions take into account prior knowledge of types (i) and (ii) only. In this case, the granules of knowledge are bounded sets in X_P and X_R (P ⊆ C and R ⊆ D), defined by the partitions of U induced by the indiscernibility relations I_P and I_R, respectively. Then, the classification patterns to be discovered are functions representing granules I_R(x) by granules I_P(x) in the condition attribute space X_P, for any P ⊆ C and any x ∈ U.

If prior knowledge includes item (iii) in addition to (i) and (ii), then the indiscernibility relation is unable to produce granules in X_C and X_D taking into account the preference order. To do so, it has to be substituted by a dominance relation in X_P and X_R (P ⊆ C and R ⊆ D). Suppose, for simplicity, that all condition attributes in C and all decision attributes in D are criteria, and that C and D are semantically correlated. Let ⪰_q be a weak preference relation on U (often called outranking) representing a preference on the set of objects with respect to criterion q ∈ C ∪ D; x_q ⪰_q y_q means “x_q is at least as good as y_q with respect to criterion q”. On the one hand, we say that x dominates y with respect to P ⊆ C (shortly, x P-dominates y) in the condition attribute space X_P (denoted x D_P y) if x_q ⪰_q y_q for all q ∈ P.
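Before the numerical reformulation below, a minimal Python sketch (the toy data table and function names are assumptions made for illustration) of the indiscernibility granules and of the P-dominance test may help fix ideas:

from collections import defaultdict

# Toy data table: objects described by two numerical criteria (assumed values).
U = {
    "x1": {"Math": 2, "Ph": 1},
    "x2": {"Math": 2, "Ph": 1},
    "x3": {"Math": 1, "Ph": 2},
}

def indiscernibility_classes(universe, P):
    """Granules of knowledge for items (i)+(ii): equivalence classes of I_P."""
    granules = defaultdict(set)
    for obj, values in universe.items():
        granules[tuple(values[q] for q in P)].add(obj)
    return list(granules.values())

def p_dominates(x, y, P, universe):
    """x D_P y: x is at least as good as y on every criterion q in P."""
    return all(universe[x][q] >= universe[y][q] for q in P)

print(indiscernibility_classes(U, ["Math", "Ph"]))   # e.g. [{'x1', 'x2'}, {'x3'}]
print(p_dominates("x1", "x3", ["Math"], U))          # True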
Assuming, without loss of generality, that the domains of the criteria are numerical, i.e. X_q ⊆ ℝ for any q ∈ C, and that they are ordered such that preference increases with the value, one can say that x D_P y is equivalent to x_q ≥ y_q for all q ∈ P, P ⊆ C. Observe that for each x ∈ X_P, x D_P x, i.e. P-dominance is reflexive. On the other hand, an analogous definition holds in the decision attribute space X_R (denoted x D_R y), R ⊆ D. The dominance relations x D_P y and x D_R y (P ⊆ C and R ⊆ D) are directional statements where x is a subject and y is a referent.

If x ∈ X_P is the referent, then one can define the set of objects y ∈ X_P dominating x, called the P-dominating set, D_P^+(x) = {y ∈ U : y D_P x}. If x ∈ X_P is the subject, then one can define the set of objects y ∈ X_P dominated by x, called the P-dominated set, D_P^-(x) = {y ∈ U : x D_P y}. The P-dominating sets D_P^+(x) and P-dominated sets D_P^-(x) correspond to positive and negative dominance cones in X_P, with the origin x.

As to the decision attribute space X_R, R ⊆ D, the R-dominance relation permits to define the sets Cl_R^{≥x} = {y ∈ U : y D_R x} and Cl_R^{≤x} = {y ∈ U : x D_R y}, where Cl_{t_q} = {x ∈ X_D : x_q = t_q} is a decision class with respect to q ∈ D. Cl_R^{≥x} is called the upward union of classes, and Cl_R^{≤x} the downward union of classes. If y ∈ Cl_R^{≥x}, then y belongs to class Cl_{t_q}, x_q = t_q, or better, on each decision attribute q ∈ R; if y ∈ Cl_R^{≤x}, then y belongs to class Cl_{t_q}, x_q = t_q, or worse, on each decision attribute q ∈ R. The upward and downward unions of classes correspond to positive and negative dominance cones in X_R, respectively.

In this case, the granules of knowledge are open sets in X_P and X_R defined by the dominance cones D_P^+(x), D_P^-(x) (P ⊆ C) and Cl_R^{≥x}, Cl_R^{≤x} (R ⊆ D), respectively. Then, the classification patterns to be discovered are functions representing granules Cl_R^{≥x}, Cl_R^{≤x} by granules D_P^+(x), D_P^-(x), respectively, in the condition attribute space X_P, for any P ⊆ C, R ⊆ D and any x ∈ X_P. In both cases above, the functions are sets of decision rules.

3 Dominance-Based Rough Set Approach (DRSA)

3.1 Granular Computing with Dominance Cones

Suppose, for simplicity, that the set D of decision attributes is a singleton, D = {d}. Decision attribute d makes a partition of U into a finite number of classes Cl = {Cl_t, t ∈ T}, T = {1, ..., n}. Each x ∈ U belongs to one and only one class Cl_t ∈ Cl. The upward and downward unions of classes boil down, respectively, to:

Cl_t^≥ = ∪_{s≥t} Cl_s,   Cl_t^≤ = ∪_{s≤t} Cl_s,   t = 1, ..., n.
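To tie these notions together, a small self-contained Python sketch (the toy data and class encoding are assumptions) of the dominance cones and of the upward and downward unions could read:

# Toy universe with one decision attribute d whose classes are ordered 1 < 2 < 3.
U = {
    "x1": {"Math": 1, "Ph": 1, "d": 1},
    "x2": {"Math": 2, "Ph": 1, "d": 2},
    "x3": {"Math": 2, "Ph": 2, "d": 3},
}
P = ["Math", "Ph"]

def d_plus(x):    # P-dominating set D_P^+(x): objects y with y D_P x
    return {y for y in U if all(U[y][q] >= U[x][q] for q in P)}

def d_minus(x):   # P-dominated set D_P^-(x): objects y with x D_P y
    return {y for y in U if all(U[x][q] >= U[y][q] for q in P)}

def upward_union(t):    # Cl_t^>= : objects belonging to class t or better
    return {y for y in U if U[y]["d"] >= t}

def downward_union(t):  # Cl_t^<= : objects belonging to class t or worse
    return {y for y in U if U[y]["d"] <= t}

print(d_plus("x2"))        # {'x2', 'x3'}  (set order may vary)
print(d_minus("x2"))       # {'x1', 'x2'}
print(upward_union(2))     # {'x2', 'x3'}
print(downward_union(2))   # {'x1', 'x2'}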
